FLB: Fast Load Balancing for Distributed-Memory Machines
Authors
Abstract
This paper describes a novel compile-time, list-based task scheduling algorithm for distributed-memory systems, called Fast Load Balancing (FLB). Compared to other typical list scheduling heuristics, FLB drastically reduces scheduling time complexity to O(V(log W + log P) + E), where V and E are the number of tasks and edges in the task graph, respectively, W is the task graph width, and P is the number of processors. It is proven that FLB is essentially equivalent to the existing ETF scheduling algorithm, which has O(W(E + V)P) time complexity. Experiments also show that FLB performs on par with other one-step algorithms of much higher cost, such as MCP. Moreover, FLB consistently outperforms multi-step algorithms such as DSC-LLB, which also have higher cost.
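To get a rough feel for the gap between the two bounds quoted in the abstract, the sketch below plugs one set of sizes into both expressions. The values of V, E, W, and P are hypothetical and chosen only for illustration, the function names are not from the paper, and constant factors hidden by the O-notation are ignored, so the numbers indicate growth rates rather than measured scheduling times.

```python
import math


def flb_cost(v, e, w, p):
    """Evaluate the FLB bound V(log W + log P) + E, ignoring constants."""
    return v * (math.log2(w) + math.log2(p)) + e


def etf_cost(v, e, w, p):
    """Evaluate the ETF bound W(E + V)P, ignoring constants."""
    return w * (e + v) * p


# Hypothetical task graph and machine sizes (assumptions, not from the paper).
V, E, W, P = 1000, 5000, 64, 16

print(f"FLB bound: {flb_cost(V, E, W, P):,.0f}")  # FLB bound: 15,000
print(f"ETF bound: {etf_cost(V, E, W, P):,.0f}")  # ETF bound: 6,144,000
```

For these assumed sizes the ETF expression is roughly 400 times larger, because it multiplies the graph width and processor count into the per-edge and per-task work, whereas the FLB expression adds only logarithmic factors per task.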
Similar resources
Dynamic Load Balancing for Raytraced Volume Rendering on Distributed Memory Machines
We present a technique for adaptive load balancing for ray-traced volume rendering on distributed-memory machines using a hierarchical representation of volume data. Our approach partitions the image onto processors while preserving scanline coherence. Volume data is assumed replicated on each processor since our focus in this paper is to characterize computation and communication requirements fo...
Maintaining Spatial Data Sets in Distributed-Memory Machines
We propose a distributed data structure for maintaining spatial data sets on message-passing, distributed memory machines. The data structure is based on orthogonal bisection trees and it captures relevant characteristics of parallel machines. The operations we consider include insertion, deletion, and range queries. We introduce parameters to control how much imbalance is tolerated at each pro...
Efficient Implementation of Sorting Algorithms on Asynchronous Distributed-Memory Machines
The problem of merging two sequences of elements which are stored separately in two processing elements (PEs) occurs in the implementation of many existing sorting algorithms. We describe efficient algorithms for the merging problem on asynchronous distributed-memory machines. The algorithms reduce the cost of the merge operation and of communication, as well as partly solving the problem of load...
Fsc: A Sisal Compiler for Both Distributed- and Shared-Memory Machines
This paper describes a prototype Sisal compiler that supports distributed- as well as shared-memory machines. The compiler, fsc, modifies the code-generation phase of the optimizing Sisal compiler, osc, to use the Filaments library as a run-time system. Filaments efficiently supports fine-grain parallelism and a shared-memory programming model. Using fine-grain threads makes it possible to implement re...